ABSTRACT

Data on the intrinsic characteristics of an educational program are
essential in pilot testing new programs in
order to determine how successfully design components have achieved
their intended purpose. At the outset of an evaluation of such data,
it is necessary to define the evaluator's role and determine what
intrinsic information will be required, when it will be needed, who
can provide it, and how it will be collected. The Adjusted Agreement
Index (AAI) was used in a case study to summarize data from a complex
program developed by the Institute of Canadian Bankers, in which
eight university-level distance education courses were simultaneously
pilot-tested for 26 weeks with over 1,000 Canadian students. The AAI
score depicts relative agreement by respondents and is calculated by
subtracting the percentage of respondents who disagree with a
statement from the percentage who agree. Results indicate that the
AAI is a helpful tool in summarizing results clearly so that
decision-makers will know which intrinsic characteristics require
immediate attention for revision purposes. The AAI is easy to
calculate and use with subjective response questionnaires. Two
references are listed, and four figures illustrate evaluation design,
AAI calculation, score distribution for various AAI results, and
student perceptions of assignments by course.

EVALUATING A COMPLEX PROGRAM:

WHERE TO START AND HOW TO FINISH



Lengthy questionnaires are frequently administered when
assessing instructional materials. However, evaluators typically
find themselves overwhelmed with data, and use only key questions
in an "eyeball" fashion. This presentation offers a case study
of training evaluation, and methods utilized to overcome data
interpretation problems. A unique "Adjusted Agreement Index" is
discussed as an effective means of evaluating subjective
responses with large samples. These techniques are useful for
clear presentations to superiors, and content revision using
questionnaires.

EVALUATING A COMPLEX PROGRAM: WHERE TO START AND HOW TO FINISH



Often evaluators are called upon to provide information
on the intrinsic characteristics of an education program. These
non-performance data are essential in pilot testing new programs
in order to determine how successfully design components have
achieved their intended purpose (Scriven, 1967). The purpose of
this paper is to explore some of the key questions to be answered
in evaluating intrinsic data, and to illustrate how the Adjusted
Agreement Index (AAI), which was developed by these authors, was
used to summarize data in an easily presentable fashion.

This study focuses on a complex program developed by
the Institute of Canadian Bankers. Eight university-level
courses were pilot tested simultaneously: Accounting, Business
Administration, Business Finance, Business Strategy,
Communications, Economics, Marketing and Organizational Behaviour.
Each course was designed by a different professor. Four Canadian
universities were represented. All courses were designed for
distance education and all materials were administered directly from
the Institute's head office in Montreal through nine Canadian
banks. Over 1,000 students throughout Canada were involved in
the 26-week pilot test.

WHERE TO START AN EVALUATION

1. What is the evaluator's role? A key decision at a study's
outset is to identify the role of the evaluator. By definition,
evaluation must reach a decision. What part will the
evaluator play in making that decision? In our case, it was
decided that the evaluator's role was to provide information
rather than make decisions based on that information. This
arrangement thus follows Stufflebeam's Context, Input,
Process, Product approach to evaluation (Stufflebeam,
1968).

2. What type of decision is needed? A second consideration
concerns the type of evaluation decisions which are to be
made. As our example was a pilot test of a new program,
decisions would be required on how to revise the program
based upon how well the design led to its intended outcome.
Thus, the focus was on identifying design flaws and delivery
problems, constituting formative, rather than summative,
evaluation.

3. What intrinsic information is required? A third consideration
concerns information. In our situation, the study
focused on finding out who the students were, how well they
linked up with the content, and how well the content itself
was structured for effective instruction. Therefore, three
information types were identified: personal data, delivery
system, and teaching material.

4. When will the information be required? A fourth consideration
is timing. In our case, the greatest initial concern
was how well students linked up with content. Thus,
information on the success of the delivery system was required
almost as soon as the program was initiated. More specifically,
were students receiving material on time? What condition
were materials in upon arrival? Information on
teaching materials was required at interim points throughout
the 26-week term before various decisions were to be made
regarding assignments, revisions to lesson plans, or
interpretation of student difficulty with textbooks.

5. Who can provide information? A fifth consideration
concerns sources of information. In our case, four sources
were identified: assignment correctors, program administrators,
dropouts, and students sitting final exams. Assignment
correctors were professors from the various universities
contracted by the Institute to grade assignments. Program
administrators assisted in the delivery of materials
through various banks in Canada. They were located in
regional offices of Canadian banks, at the Institute's
regional offices and at its head office in Montreal.
Dropouts were defined as students who did not complete the
first two assignments. Students sitting final exams were
those who actually made it through the 26 weeks and had
registered for the final exam.

6. How will the information be collected? A sixth consideration
focuses on data collection. Two different types of
information gathering instruments were identified:
questionnaires, and small group interviews (focus groups).
Questionnaires were felt to be appropriate in obtaining information
from assignment correctors, dropouts, and students sitting
the final exams. With reference to dropouts, the questionnaire
could be distributed directly by assignment correctors
because they had the most up-to-date class lists and
information on which students had not turned in their
assignments. The other two groups could be contacted
directly by the Institute. Focus group interviews were felt
to be appropriate for gathering information from program
administrators because of their small numbers, their
geographical diversity and the need for immediate information
for decision-making.

The overall evaluation design is represented in Figure 1.
This particular design was agreed upon before the first student
enrolled in the program because it was felt that if all contributors
to the program assisted in designing the evaluation study,
they would have a vested interest in acting on evaluation
results. Therefore, program administrators, course designers,
evaluators, and instructors all met in a "strategy session" to
map out information collection methods, time frames and
information sources.



INSERT FIGURE 1 HERE



HOW TO FINISH

Evaluations frequently produce at least some of the
above types of data, but the results are seldom used properly.
Even more often, results are employed in an ad hoc fashion to
support decisions made while waiting for the data. The index
described in this section is an attempt at making the collected
data "worth waiting for", i.e., interpretable.

In this case study, information from the assignment
correctors and program administrators was transparent and largely
idiosyncratic. Dropouts provided useful systemic data, but these
are not dealt with here as they do not address the formative question
of instructional effectiveness. The following comments are thus
restricted to questionnaire results from students sitting final
exams. The intent is to introduce the Adjusted Agreement Index
(AAI), which was developed to summarize data in an easily
presentable fashion.

In our example, 311 final exam questionnaires were completed
out of 529, for a 59% return rate. The questionnaires
centred on intrinsic data, exploring the effectiveness of the
overall course, textbooks, lesson guides, and assignments. The
case does not distinguish between those who successfully completed
their exams and those who did not. Given these data, the
evaluator's challenge was to summarize clearly in order to
facilitate administrators' understanding.

Fifty-five statements were followed by 7-point Likert-type
scales. Positions on the scale ranged from Strongly Agree,
through Neither Agree nor Disagree (neutral or no opinion), to
Strongly Disagree.

The AAI depicts relative agreement by respondents and
is calculated by subtracting the percentage of respondents who
disagree with a statement (checked 6 or 7 on the scale) from the
percentage who agree (checked 1 or 2). This difference is the
AAI. In the example in Figure 2, 30% strongly agreed with the
statement, 20% strongly disagreed, and 50% neither agreed nor
disagreed. The AAI is calculated to be 10.
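
The calculation can be sketched in Python as follows. This is a
minimal illustration rather than the procedure actually used in the
study: the function name adjusted_agreement_index and the
list-of-ratings input are assumptions, the scale is assumed to be
coded 1 (Strongly Agree) through 7 (Strongly Disagree), and the 20%
disagreement figure in the sample is inferred from the stated 30%
agreement, 50% neutral responses and AAI of 10.

```python
def adjusted_agreement_index(ratings):
    """Return the AAI for a list of ratings on a 1-7 scale.

    Assumes ratings of 1 or 2 count as agreement and ratings of
    6 or 7 count as disagreement, as described in the text.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    n = len(ratings)
    agree = sum(1 for r in ratings if r in (1, 2))
    disagree = sum(1 for r in ratings if r in (6, 7))
    # Percentage agreeing minus percentage disagreeing.
    return 100.0 * (agree - disagree) / n


# Hypothetical sample of 100 respondents mirroring the Figure 2
# percentages: 30% agree, 50% neutral, 20% disagree.
sample = [1] * 30 + [4] * 50 + [7] * 20
print(adjusted_agreement_index(sample))
```

Responses at positions 3, 4 and 5 lower the magnitude of the index
only indirectly, by enlarging the total against which the two
percentages are taken.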



INSERT FIGURE 2 HERE 



As illustrated in Figure 3, the AAI is tied to the
distribution of responses. In the first example, the distribution
is skewed towards the agreement side of the scale. The
result is a high positive AAI, indicating greater overall agreement
with the statement. In the second example, the distribution
is skewed toward the right, yielding a high negative AAI. This
translates as greater overall disagreement with the statement.
Examples 3 and 4 indicate normal and bi-modal distributions, both
of which yield a low positive or negative AAI. This lower AAI
score represents lesser overall agreement or disagreement with
the statement and in essence says that most respondents are
neutral or are mixed in their opinions.
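
The link between distribution shape and AAI can be checked with a
short Python sketch. The four sets of counts below are invented for
illustration and are not taken from Figure 3; the function name
aai_from_counts and the seven-element count format (index 0 is scale
position 1, Strongly Agree) are likewise assumptions.

```python
def aai_from_counts(counts):
    """AAI from seven response counts, index 0 = Strongly Agree."""
    total = sum(counts)
    agree = counts[0] + counts[1]      # scale positions 1 and 2
    disagree = counts[5] + counts[6]   # scale positions 6 and 7
    return 100.0 * (agree - disagree) / total


# Hypothetical distributions of 100 respondents each.
examples = {
    "skewed toward agreement": [40, 30, 15, 10, 3, 1, 1],
    "skewed toward disagreement": [1, 1, 3, 10, 15, 30, 40],
    "normal": [2, 8, 20, 40, 20, 8, 2],
    "bimodal": [30, 15, 4, 2, 4, 15, 30],
}

for label, counts in examples.items():
    print(f"{label}: AAI = {aai_from_counts(counts):+.0f}")
```

With these invented counts the two skewed distributions give scores
far from zero, while the normal and bimodal distributions both give
a score of zero even though their shapes differ sharply.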



INSERT FIGURE 3 HERE 



A simple decision making rule was established for purposes
of identifying components requiring the earliest attention:
priority decisions are indicated by relatively high
deviations from desirable AAI scores.

Figure 4 illustrates how the decision rule was applied
to course assignments. The statement which said that there was
an adequate number of assignments yielded a considerable variety
of AAI scores. Business Finance, with a score of 70, indicates
that a high percentage of students agreed with the statement (100
would be perfect, though highly unlikely). On the other hand,
both Accounting and Business Administration yielded AAI scores of
18, indicating considerably less agreement over the number of
assignments. The priority decision was to determine whether
Accounting and Business Administration required additional
assignments.
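
One way to apply the decision rule is to rank courses by their
distance from the desired score, as in the Python sketch below. Only
the three scores reported above are used; the function name
rank_by_deviation is an assumption, as is the choice of 100 (the
"perfect" score mentioned above) as the desired level for this
statement.

```python
def rank_by_deviation(scores, desired):
    """Order (course, AAI) pairs from largest to smallest deviation."""
    return sorted(
        scores.items(),
        key=lambda item: abs(desired - item[1]),
        reverse=True,
    )


# AAI scores reported in the text for the statement on the
# adequacy of the number of assignments.
scores = {
    "Business Finance": 70,
    "Accounting": 18,
    "Business Administration": 18,
}
DESIRED_AAI = 100  # assumed target for this statement

ranked = rank_by_deviation(scores, DESIRED_AAI)
for course, score in ranked:
    print(f"{course}: AAI {score}, deviation {DESIRED_AAI - score}")
```

Courses at the top of the ranking are the ones the rule singles out
for the earliest attention.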



INSERT FIGURE 4 HERE

As another example, statements which dealt specifically
with overall structure were pooled. Great care must be taken to
combine only statements addressing the exact same issue. All the
courses revealed relatively high AAI scores, varying from 43 to
61. This could be interpreted to mean that the overall
structure of assignments was considered to be satisfactory as
none deviated significantly from the desired level, and no
immediate action was required.
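
The text does not say how pooled scores were computed. One plausible
reading, sketched below in Python, is to add up the responses to all
statements on the same issue and compute a single AAI from the
combined counts. The function name pooled_aai and the counts are
hypothetical, and averaging the per-statement AAI scores would be an
equally reasonable alternative.

```python
def pooled_aai(statement_counts):
    """AAI over the combined responses to several statements.

    statement_counts is a list of seven-element count lists, one per
    statement, with index 0 = scale position 1 (Strongly Agree).
    """
    # Add the counts position by position across statements.
    pooled = [sum(column) for column in zip(*statement_counts)]
    total = sum(pooled)
    agree = pooled[0] + pooled[1]
    disagree = pooled[5] + pooled[6]
    return 100.0 * (agree - disagree) / total


# Hypothetical counts for two statements on overall structure,
# 60 respondents each.
structure_statements = [
    [20, 25, 5, 5, 2, 2, 1],
    [15, 20, 10, 8, 3, 3, 1],
]
print(round(pooled_aai(structure_statements), 1))
```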

Five statements on the questionnaire dealt with ease in 
completing assignments. Students in all eight courses indicated 
their disagreement as reflected in the negative AAI scores. 
Economics, with an AAI of -42, is the one course where students 
had the greatest difficulty completing assignments. Accounting, 
with an AAI of -1, suggests the least difficulty in completing 
assignments. In this instance, the priority decision was to 
determine why Economics assignments were perceived as being too 
hard. Other courses would be addressed in order of the severity 
of the problem, the number of students affected, the difficulty
of revising, and so on. 

In our final example, statements concerning the receipt 
of good feedback yielded both positive and negative AAI scores. 
Students in Business Finance and Communications indicated through 
their AAI scores that they received relatively good feedback. 
Students in Accounting and Business Strategy, on the other hand,
indicated they were less happy with feedback. The priority
decision was to examine feedback quality in Accounting and
Business Strategy.

When AAI scores within a course consistently deviate
from an acceptable level, such as in Accounting or Business
Administration, that course should be examined more thoroughly.
The level of detail provided by the evaluation questionnaire and
performance data will further guide revision. The AAI is
extremely effective at alerting the designer to problems, but
must always be considered with other data when available. For
example, if the Economics assignments were very difficult, but
produced excellent performance results and generally satisfied
students (the "pain" is necessary for learning), then revision is
less critical, or perhaps even undesirable.

The AAI is thus a relative score based on distribution.
It allows direct and easy comparison of each aspect
measured. The comparison can be made both with the ideal AAI
level, and with other courses/programs where available. One
weakness of the AAI is that it is not an absolute score (a low
score on one aspect may be a high score on another aspect), and is
thus potentially confusing to non-users. However, direct comparisons
should dissipate such problems. The AAI is also not very useful
if the quantity of data is small. However, for a complex program
such as the example described above, the AAI successfully
distinguished areas requiring immediate attention. Finally, the use
of the AAI requires the evaluator to attend conscientiously
to the response distribution. A bimodal distribution may produce a
nondescript AAI, but suggest important differences within the
target population, such as prior knowledge or language. The AAI
fails to independently alert the user to this type of problem,
but so do all other statistics.
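
A simple safeguard, which is not part of the AAI as described here,
is to report the agreement and disagreement percentages next to the
index and to flag statements where both are large. The Python sketch
below illustrates the idea; the function name summarize, the 25%
threshold and the counts are all assumptions chosen for illustration.

```python
def summarize(counts, split_threshold=25.0):
    """Return the AAI with its two component percentages.

    counts holds seven response counts, index 0 = Strongly Agree.
    possible_split is True when both the agreeing and the
    disagreeing percentages reach split_threshold.
    """
    total = sum(counts)
    agree = 100.0 * (counts[0] + counts[1]) / total
    disagree = 100.0 * (counts[5] + counts[6]) / total
    return {
        "aai": agree - disagree,
        "agree_pct": agree,
        "disagree_pct": disagree,
        "possible_split": agree >= split_threshold
        and disagree >= split_threshold,
    }


# Two hypothetical distributions with the same AAI of zero.
bimodal = summarize([30, 15, 4, 2, 4, 15, 30])
normal = summarize([2, 8, 20, 40, 20, 8, 2])
print(bimodal)
print(normal)
```

In this invented case only the bimodal distribution is flagged, which
would prompt a closer look at subgroups within the respondents.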

A final word is necessary regarding standard descriptive
statistics. The mean, median and mode all reflect central
tendencies, or averages, whereas an evaluator is often interested
in extremes, or deviations from the norm. Percentages, when left
untreated, are even more relativistic and open to interpretation
than the AAI. The AAI, in fact, forces the user to recognize
both the value of deviation scores, and the limitations inherent
in their non-absolute level of interpretation. In other words,
they are easy to interpret only when the criterion for establishing
a meaning is supplied. This guards against both confusion
and misinterpretation.

In summary, intrinsic characteristics of an educational
program affect how well the program achieves its intended purpose.
After defining the evaluator's role, it is necessary at a
study's outset to identify what information will be required,
when it will be needed, who can provide the information, and how
the information will be collected. The Adjusted Agreement Index
is a helpful tool in summarizing results clearly so that
decision-makers will know which intrinsic characteristics require
immediate attention for revision purposes. This case study
involving eight courses designed for distance education suggests
that the AAI is easy to calculate and use with subjective
response questionnaires.